Abstract — Spatial interference avoidance is a simple and ef-
fective way of mitigating interference in multi-antenna wireless
networks. The deployment of this technique requires channel- 
state information (CSI) feedback from each receiver to all 
interferers, resulting in substantial network overhead. To address 
this issue, this paper proposes the method of distributive control 
that intelligently allocates CSI bits over multiple feedback links 
and adapts feedback to channel dynamics. For symmetric channel 
distributions, it is optimal for each receiver to equally allocate the 
average sum-feedback rate for different feedback links, thereby 
decoupling their control. Using the criterion of minimum sum- 
interference power, the optimal feedback-control policy is shown 
using stochastic-optimization theory to exhibit opportunism. 
Specifically, a given feedback link is turned on only when the
corresponding transmit-CSI error is significant or interference- 
channel gain is large, and the optimal number of feedback bits 
increases with this gain. For high mobility and considering the 
sphere-cap-quantized-CSI model, the optimal feedback-control 
policy is shown to perform water-filling in time, where the number 
of feedback bits increases logarithmically with the corresponding 
interference-channel gain. Furthermore, we consider asymmetric 
channel distributions with heterogeneous path losses and high 
mobility, and prove the existence of a unique optimal policy for 
jointly controlling multiple feedback links. Given the sphere-cap-
quantized-CSI model, this policy is shown to perform water-
filling over feedback links. Finally, simulation demonstrates that
feedback-control yields significant throughput gains compared 
with the conventional differential-feedback method. 

Index Terms — Interference channels, array signal process- 
ing, stochastic optimal control, feedback communication, time- 
varying channels, dynamic programming, Markov processes 



I. Introduction 

Interference limits the performance of decentralized wire- 
less networks but can be effectively mitigated by multi- 
antenna techniques, namely spatial interference cancelation 
and avoidance. In a frequency-division-duplexing network, 
spatial interference avoidance at interferers requires feedback 
of interference channel state information (CSI) from all in- 
terfered receivers, called cooperative feedback. Given finite- 
rate cooperative feedback, CSI quantization errors result in 
residual interference. Suppressing such interference requires 
high-resolution feedback over a network of feedback links,
resulting in overwhelming network overhead. This calls for re-
search on intelligent feedback control that optimally allocates
feedback bits over multiple feedback links and adapts feedback 
to channel dynamics, which is the theme of this paper. 

A. Prior Work 

Extensive research has been carried out on designing 
feedback-CSI-quantization algorithms for multi-antenna sys- 
tems, called limited feedback [1], based on different ap- 
proaches including line packing [2] and Lloyd's algorithm [3]. 
Besides quantization, another effective approach for compress- 
ing feedback CSI is to explore CSI redundancy due to the 
wireless-channel correlation in time [4], [5], frequency [6], and 
space [7]. Though a few feedback bits suffice in a point-to- 
point multi-antenna system, the feedback requirement is more 
stringent in multi-antenna downlink where CSI errors cause 
multiuser interference [8]. This motivates the joint design of 
CSI feedback and scheduling algorithms to exploit multiuser 
diversity for reducing the required numbers of feedback bits 
[9]-[12]. Both high-resolution feedback for multi-antenna 
downlink and progressive feedback for correlated channels 
require CSI feedback with adjustable resolutions. This is 
realized using hierarchical CSI-quantizer codebooks [13], [14] 
or systematic codebook generation [15], [16]. The current 
work also concerns variable-rate feedback but focuses on 
feedback control rather than codebook designs. 

Recent research on limited feedback explores more complex 
network topologies. In [17], the decentralized wireless net- 
works based on interference alignment [18] are considered, 
and the required scaling of the numbers of feedback bits 
with respect to the signal-to-noise ratio (SNR) is derived such 
that the channel capacity is achieved for high SNRs. The 
Grassmannian codebooks designed for point-to-point beam-
forming systems with limited feedback are shown in [19] to be
suitable for multiple-input-multiple-output (MIMO) amplify- 
and-forward relay systems. The algorithms for cooperative 
feedback from the primary user to the secondary user are 
designed in [20] for implementing cognitive beamforming in 
two-user cognitive-radio systems. Moreover, Lloyd's algorithm 
is applied in [21] to jointly quantize the CSI sent by a mobile 
to the desired and interfering base stations. The above prior 
work does not explicitly optimize the tradeoff between the 
network performance and the amount of CSI overhead. 

In wireless networks, excessive CSI feedback yields 
marginal performance gain per additional feedback bit but 
insufficient feedback causes unacceptable performance degra-
dation. Therefore, feedback control is a pertinent issue for
designing efficient wireless networks. In [22], CSI feedback
rates are optimized for maximizing the sum throughput in a 
two-way beamforming system where a pair of transceivers 
exchange both data and CSI. For a transmit beamforming 
system, bandwidth is optimally partitioned for CSI feedback 
and data transmission [23]. For point-to-point multi-antenna 
precoding, sub-optimal algorithms have been proposed to ad- 
just the CSI-codebook size according to the channel state [24] 
or jointly with the feedback interval based on channel temporal 
correlation [25]. The problem of splitting the sum-feedback 
rate by a mobile for multiple cooperative-feedback links to 
interferers is studied in [26] in the context of base-station 
collaboration. It was shown that more feedback bits should 
be sent to nearer interfering base stations so as to reduce the 
throughput loss caused by feedback quantization. The splitting 
of the sum-feedback rate among multiple users in a multi- 
antenna downlink system was investigated in [27], where the 
optimal feedback rate for a user is shown to increase log- 
arithmically with the target signal-to-interference-plus-noise 
ratio (SINR). The feedback-bit allocation considered in prior 
work is mostly static, targeting dedicated feedback channels 
in cellular networks [28]. In decentralized networks where a 
feedback channel is shared by multiple users, feedback
allocation should be adapted to channel dynamics for greater
efficiency, motivating event-driven feedback and stochastic
feedback control.

B. Contributions and Organization 

This work adopts the approach of stochastic feedback 
control proposed in [29] but targets more complex systems. 
Specifically, this paper concerns the K-user multiple-input-
single-output (MISO) interference channel where there is an
event-driven feedback controller at each receiver. The feed- 
back controller dynamically and distributively determines the 
CSI feedback rate for each feedback link according to local 
CSI. As a result, each feedback controller serves multiple 
cooperative-feedback links in the current system rather than 
a single feedback link to the intended transmitter as in [29].^1
Furthermore, we generalize the on/off feedback control in
[29] to the variable-rate feedback control. 

This work establishes a novel approach of using stochastic 
feedback control to achieve the optimal tradeoff between 
the CSI-feedback overhead and sum interference power in 
the K-user multi-antenna interference channel. The feedback 
controllers are designed based on several key assumptions. 
Channel coefficients are assumed to be independent and identi- 
cally distributed (i.i.d.). The expectation of a CSI quantization 
error is assumed to be a monotone decreasing and convex 
function of the number of feedback bits, which is consistent 
with the popular CSI-quantizer models in [30], [31]. Moreover, 
the channel parameters, namely channel gains and transmit 
CSI (CSIT) errors, are assumed to vary in time following 
Markov chains. The channel temporal correlation is further 

^1 In this paper, we focus on cooperative feedback with some discussion of
direct-link feedback, namely CSI feedback from receivers to their intended
transmitters. Hereafter, cooperative feedback is referred to simply as feedback
whenever there is no confusion.



characterized by two assumptions. Given no feedback, sam- 
ples of the channel-parameter processes conditioned on large 
past realizations stochastically dominate those conditioned on 
small ones; given feedback, the tail probability of the CSIT 
error is a monotone decreasing and convex function of the 
corresponding number of feedback bits in the past slot. The 
channels are assumed to follow independent block fading for 
the limiting case of high mobility. Based on these assumptions, 
the key findings of this work are summarized as follows. 

- Under an average sum-feedback-rate constraint, a feed-
back controller is designed as a Markov decision pro-
cess with average cost. By channel symmetry, it is
optimal for each controller to equally split the average 
sum-feedback rate for all feedback links, reducing the 
problem of optimizing the multiple-feedback-link control 
policy to the single-feedback-link-policy optimization. 
The optimal policy for minimizing the average sum- 
interference power is shown to exhibit opportunism. 
Specifically, feedback should be performed only when 
the corresponding interference-channel gain is large or 
the CSIT error is significant. Upon feedback, the optimal 
number of feedback bits for each feedback link increases 
with the corresponding interference-channel gain but is 
independent of the observed CSIT error.

- For high mobility and considering the sphere-cap- 
quantized-CSI model [9], [30], more elaborate proper- 
ties of the optimal feedback-control policy are derived. 
Specifically, it is shown that the number of feedback 
bits for each feedback link follows water-filling in time 
and is proportional to the logarithm of the corresponding 
interference-channel gain. 

- We also consider asymmetric channel distributions where 
interference-channel gains are scaled by heterogeneous 
path losses. For high mobility, the problem of feedback- 
control-policy optimization is decomposed into a mas- 
ter problem that optimally allocates average feedback 
rates for multiple feedback links, and a sub-problem 
that optimizes the policy for controlling the feedback- 
bit allocation in time for a particular feedback link given 
an allocated average feedback rate. These decomposed op-
timization problems are proved to yield a unique optimal
policy. Furthermore, given the sphere-cap-quantized-CSI 
model, the optimal feedback-control policy is shown to 
perform water-filling over feedback links. 

The remainder of this paper is organized as follows. The 
system model is described in Section II. The problem formu- 
lation for the optimal feedback control is presented in Sec- 
tion III. The optimal feedback-control policies for the general 
case and the limiting case of high mobility are analyzed in 
Sections IV and V, respectively. In Section V-B, the design of
the feedback controller for asymmetric channel distributions 
is discussed. Simulation results are presented in Section VI. 

II. System Model 

Fig. 1. The K-user MISO interference channel with cooperative feedback

We consider the K-user MISO interference channel as illus-
trated in Fig. 1. Provisioned with L antennas, each transmitter
sends a single data stream to an intended receiver using
beamforming. As illustrated in Fig. 2, time is slotted and
each slot is divided into the feedback phase (feedback control 
and cooperative CSI feedback) and the data phase (data 
transmission). Each system parameter affected by feedback is 
represented by the same symbol without and with the hat accent
"^", corresponding to the beginnings of the feedback and data
phases, respectively. Moreover, the subscript t denotes the slot 
index. 

A. Zero-Forcing Transmit Beamforming 

Each transmitter uses beamforming to null interference to
$(K - 1)$ unintended receivers. Let $\mathbf{h}_t^{[mn]}$ denote the $L \times 1$
vector representing the channel from transmitter $n$ to receiver
$m$. To facilitate exposition, we decompose $\mathbf{h}_t^{[mn]}$ as
$\mathbf{h}_t^{[mn]} = \sqrt{g_t^{[mn]}}\, \mathbf{s}_t^{[mn]}$, where $g_t^{[mn]} = \|\mathbf{h}_t^{[mn]}\|^2$ is the channel gain
and $\mathbf{s}_t^{[mn]} = \mathbf{h}_t^{[mn]} / \|\mathbf{h}_t^{[mn]}\|$ specifies the channel direction.
Transmitter $n$ applies zero-forcing beamforming by choosing
its beamformer $\mathbf{f}_t^{[n]}$ to be orthogonal to the interference-
channel directions. As a result, the $K$ links are decoupled if
all transmitters have perfect CSIT of the channels to their
interfered receivers.

Consider the scenario where transmit beamforming at a
transmitter relies on finite-rate CSI feedback from interfered
receivers. Let $\hat{\mathbf{u}}_t^{[mn]}$ with unit norm denote the CSIT at
transmitter $n$ updated by the feedback of $\hat{\mathbf{s}}_t^{[mn]}$ from receiver
$m$. Then the zero-forcing beamformer $\mathbf{f}_t^{[n]}$ at transmitter $n$
satisfies the constraints $(\mathbf{f}_t^{[n]})^\dagger \hat{\mathbf{u}}_t^{[mn]} = 0$ for all $m \neq n$, which
requires $L \geq K$. Under the finite-rate feedback constraints,
imperfect CSIT results in residual interference between links.
The interference from transmitter $n$ to receiver $m$ has the
power

$$I_t^{[mn]} = g_t^{[mn]} \left| (\mathbf{f}_t^{[n]})^\dagger \mathbf{s}_t^{[mn]} \right|^2 \tag{1}$$

where unit transmission power is used by all transmitters.
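
The zero-forcing constraint and the residual interference caused by imperfect CSIT can be illustrated with a minimal sketch. It assumes a Gram-Schmidt construction and a hypothetical `seed` vector (for instance the direct-link channel direction) from which the beamformer is projected; neither choice is specified in the text, and all function names are illustrative.

```python
import math
import random


def inner(a, b):
    """Hermitian inner product a^dagger b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v]


def random_direction(L, rng):
    """Isotropic unit vector obtained from i.i.d. complex Gaussian entries."""
    return normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(L)])


def zero_forcing_beamformer(csit_directions, seed):
    """Unit-norm vector orthogonal to every CSIT direction (requires L >= K).

    The seed vector is a hypothetical starting point; it is projected onto
    the orthogonal complement of the span of the CSIT directions.
    """
    basis = []
    for u in csit_directions:
        w = list(u)
        for b in basis:
            c = inner(b, w)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        if math.sqrt(inner(w, w).real) > 1e-12:
            basis.append(normalize(w))
    f = list(seed)
    for b in basis:
        c = inner(b, f)
        f = [fi - c * bi for fi, bi in zip(f, b)]
    return normalize(f)


def interference_power(gain, f, s):
    """Interference power g * |f^dagger s|^2 under unit transmit power."""
    return gain * abs(inner(f, s)) ** 2


rng = random.Random(0)
L, K = 4, 3
true_directions = [random_direction(L, rng) for _ in range(K - 1)]
f = zero_forcing_beamformer(true_directions, random_direction(L, rng))
# With perfect CSIT the interference is nulled up to rounding error.
print([interference_power(1.0, f, s) for s in true_directions])
```

If the beamformer is instead computed from quantized directions, the same `interference_power` call evaluated at the true directions returns the residual interference discussed above.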

B. Variable-Rate Feedback Control 

Fig. 2. Variable-rate feedback control

In the feedback phase of every slot, each receiver, say
receiver $m$, sends the quantized version $\hat{\mathbf{s}}_t^{[mn]}$ of $\mathbf{s}_t^{[mn]}$ to
interferer $n$ in a variable-length packet comprising $B_t^{[mn]}$ bits.
The variable-rate feedback is modeled as $B_t^{[mn]} \in \mathbb{B}$, where
$\mathbb{B}$ is a set of nonnegative integers including $0$, which corresponds
to no feedback. As illustrated in Fig. 2, a feedback controller
at each receiver controls the number of feedback bits sent to a
particular interferer by observing the interference-channel gain
and the CSIT error that is defined as follows. The dynamics
of the CSIT $\mathbf{u}_t^{[mn]}$ at transmitter $n$ can be specified as

$$\hat{\mathbf{u}}_t^{[mn]} = \begin{cases} \hat{\mathbf{s}}_t^{[mn]}, & B_t^{[mn]} > 0 \\ \mathbf{u}_t^{[mn]}, & \text{otherwise.} \end{cases} \tag{2}$$

The CSIT error is defined as $\delta_t^{[mn]} = 1 - |(\mathbf{u}_t^{[mn]})^\dagger \mathbf{s}_t^{[mn]}|^2$,
with $\delta_t^{[mn]} = 0$ for the case of perfect CSIT [2].
The feedback controller at receiver $m$ observes the
state $\{(g_t^{[mn]}, \delta_t^{[mn]}) \mid n \neq m\}$ and generates the feedback
decision $\{B_t^{[mn]} \mid n \neq m\}$. Similarly, we define the CSI-
quantization error as

$$\epsilon_t^{[mn]} = 1 - \left| (\hat{\mathbf{s}}_t^{[mn]})^\dagger \mathbf{s}_t^{[mn]} \right|^2.$$

Assumption 1. The conditional expectation $E[\epsilon_t^{[mn]} \mid B_t^{[mn]}]$
is a monotone decreasing and convex function of $B_t^{[mn]}$.

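Example 1 (Sphere-cap-quantized-CSI model). The quantiza-
tion error $\epsilon_t^{[mn]}$ is modeled in [9], [30] to be uniformly dis-
tributed on a sphere cap with the following distribution
function

$$\Pr\left(\epsilon_t^{[mn]} \leq \tau \mid B_t^{[mn]}\right) = \begin{cases} 2^{B_t^{[mn]}} \tau^{L-1}, & 0 \leq \tau \leq 2^{-\frac{B_t^{[mn]}}{L-1}} \\ 1, & \text{otherwise.} \end{cases} \tag{3}$$

Using this model, the expectation of $\epsilon_t^{[mn]}$ is obtained as

$$E\left[\epsilon_t^{[mn]} \mid B_t^{[mn]}\right] = \frac{L-1}{L}\, 2^{-\frac{B_t^{[mn]}}{L-1}} \tag{4}$$

which is a monotone decreasing and convex function of $B_t^{[mn]}$,
consistent with Assumption 1.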
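
As a numerical cross-check of Example 1, the sketch below evaluates the closed-form mean of the sphere-cap model, compares it with a midpoint-rule integral of the tail probability, and tests monotonicity and convexity in the number of bits. The function names and the integration step count are illustrative assumptions.

```python
def sphere_cap_cdf(tau, bits, L):
    """Pr(eps <= tau | B) for the sphere-cap model with L antennas."""
    if tau < 0.0:
        return 0.0
    cap = 2.0 ** (-bits / (L - 1))
    return 2.0 ** bits * tau ** (L - 1) if tau <= cap else 1.0


def sphere_cap_mean(bits, L):
    """Closed-form E[eps | B] = (L - 1) / L * 2^(-B / (L - 1))."""
    return (L - 1) / L * 2.0 ** (-bits / (L - 1))


def numeric_mean(bits, L, steps=20000):
    """E[eps | B] as the integral of 1 - CDF over [0, 1] (midpoint rule)."""
    h = 1.0 / steps
    return h * sum(1.0 - sphere_cap_cdf((i + 0.5) * h, bits, L) for i in range(steps))


L = 4
means = [sphere_cap_mean(b, L) for b in range(9)]
first_diff = [y - x for x, y in zip(means, means[1:])]
second_diff = [y - x for x, y in zip(first_diff, first_diff[1:])]
print(all(d < 0 for d in first_diff), all(d > 0 for d in second_diff))
print(abs(numeric_mean(3, L) - sphere_cap_mean(3, L)))
```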

Example 2 (Random-vector quantization). As shown in [31],
the use of a random beamformer codebook of i.i.d. and
isotropic unitary vectors results in the following distribution
of $\epsilon_t^{[mn]}$

$$\Pr\left(\epsilon_t^{[mn]} \geq \tau \mid B_t^{[mn]}\right) = \left(1 - \tau^{L-1}\right)^{2^{B_t^{[mn]}}} \tag{5}$$

and the expectation

$$E\left[\epsilon_t^{[mn]} \mid B_t^{[mn]}\right] = 2^{B_t^{[mn]}}\, \mathrm{beta}\!\left(2^{B_t^{[mn]}}, \frac{L}{L-1}\right) \approx \alpha\, 2^{-\frac{B_t^{[mn]}}{L-1}} \tag{6}$$

where $\mathrm{beta}(\cdot, \cdot)$ denotes the beta function and $\alpha$ is a constant.
The last expression is a monotone decreasing and convex
function of $B_t^{[mn]}$, justifying Assumption 1.
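
The exact expectation in Example 2 and its exponential approximation can be compared numerically. The sketch below assumes the constant is taken as the large-codebook limit, alpha = Gamma(L / (L - 1)); the text leaves the constant unspecified, so this value and the function names are illustrative assumptions.

```python
import math


def rvq_mean(bits, L):
    """Exact E[eps | B] = 2^B * beta(2^B, L / (L - 1)), via log-gamma."""
    n = 2 ** bits
    a = L / (L - 1)
    # 2^B * beta(2^B, a) = Gamma(2^B + 1) * Gamma(a) / Gamma(2^B + a)
    return math.exp(math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a))


def rvq_approx(bits, L):
    """Approximation alpha * 2^(-B / (L - 1)) with the assumed alpha."""
    alpha = math.gamma(L / (L - 1))
    return alpha * 2.0 ** (-bits / (L - 1))


L = 4
for bits in (0, 2, 4, 8, 12):
    print(bits, rvq_mean(bits, L), rvq_approx(bits, L))
```

The two columns approach each other as the number of bits grows, which is the sense in which both quantizer models share the same exponential form.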



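Next, it follows from (2) that

$$\hat{\delta}_t^{[mn]} = \begin{cases} \epsilon_t^{[mn]}, & B_t^{[mn]} > 0 \\ \delta_t^{[mn]}, & \text{otherwise} \end{cases} \tag{7}$$

given that channels remain constant within each slot as as-
sumed in the sequel. Note that $\hat{g}_t^{[mn]} = g_t^{[mn]}$ since it is
unaffected by feedback.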

Finally, it is important to note that besides CSI, controlled
feedback requires additional bits for specifying the number
of feedback-CSI bits since it varies with the channel state.
Such overhead is unnecessary for feedback schemes with fixed
numbers of feedback bits (see e.g., [2], [5]). Let $D$ denote the
number of available decisions for the feedback-control policy.
Assuming that the policy is known to the transmitter, the total
number of feedback bits from receiver $m$ to transmitter $n$ in
the $t$-th slot is $B_t^{[mn]} + \lceil \log_2 D \rceil$. It is observed from simulation
that $D$ for the optimal policy is relatively small, e.g., 3 or 4
(see Fig. 4).
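
The per-slot overhead above is simple to compute. The following sketch uses hypothetical helper names and integer arithmetic for the ceiling of the base-2 logarithm.

```python
def control_overhead_bits(num_decisions):
    """ceil(log2 D): bits needed to signal which of D decisions was taken."""
    if num_decisions < 1:
        raise ValueError("the policy needs at least one decision")
    return (num_decisions - 1).bit_length()


def total_feedback_bits(csi_bits, num_decisions):
    """CSI bits plus the bits that identify the chosen decision."""
    return csi_bits + control_overhead_bits(num_decisions)


# With D = 3 or D = 4 decisions the overhead is two bits per feedback link.
print(total_feedback_bits(6, 3), total_feedback_bits(6, 4))
```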

C. Channel Model 

Channels vary with time but remain constant within each
slot. For simplicity, all channel coefficients, namely the el-
ements of the vectors $\{\mathbf{h}_t^{[mn]}\}$, are assumed to be samples
of i.i.d. circularly symmetric complex Gaussian processes with
unit variance, which is denoted as $\mathcal{CN}(0, 1)$ (asymmetric
channel distributions are considered in Section V-B). Note that
as a result of channel isotropy, the two channel parameters
$g_t^{[mn]}$ and $\hat{\delta}_t^{[mn]}$ are independent conditioned on $B_t^{[mn]}$. The
channel temporal correlation is modeled using the following
two assumptions.

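Assumption 2. Each channel coefficient evolves as a Markov
chain. Given $B_{t-1}^{[mn]} = 0$, the distributions of $(g_t^{[mn]}, \delta_t^{[mn]})$
conditioned on $(g_{t-1}^{[mn]}, \delta_{t-1}^{[mn]})$ satisfy

$$\Pr\left(\delta_t^{[mn]} > \tau_1 \mid \delta_{t-1}^{[mn]} = a_1\right) \geq \Pr\left(\delta_t^{[mn]} > \tau_1 \mid \delta_{t-1}^{[mn]} = b_1\right)$$

$$\Pr\left(g_t^{[mn]} > \tau_2 \mid g_{t-1}^{[mn]} = a_2\right) \geq \Pr\left(g_t^{[mn]} > \tau_2 \mid g_{t-1}^{[mn]} = b_2\right)$$

if $a_1 \geq b_1$ and $a_2 \geq b_2$, where $0 \leq \tau_1 \leq 1$ and $\tau_2 \geq 0$.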

The above assumption states that given no feedback, large 
CSIT error and channel power in the current slot are likely to 
stay large in the next slot due to channel temporal correlation. 

Assumption 3. For $B_{t-1}^{[mn]} > 0$, the conditional distribution
$\Pr\left(\delta_t^{[mn]} > \tau \mid B_{t-1}^{[mn]}\right)$ is a monotone decreasing and convex
function of $B_{t-1}^{[mn]}$.

Note that upon feedback, $\delta_t^{[mn]}$ is independent of $\delta_{t-1}^{[mn]}$ as a
result of (7).

Finally, for the limiting case of high mobility, channels are
assumed to follow independent block fading. For this
case, Assumptions 2 and 3 are trivial and not required in the
analysis.

D. Performance Metric 

The objective for designing the distributed feedback con-
troller at each receiver is to minimize the average interference
power. For receiver $m$, this metric is given as

$$I^{[m]} = \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} \sum_{\substack{n=1 \\ n \neq m}}^{K} I_t^{[mn]} \right] \tag{8}$$

with $I_t^{[mn]}$ given in (1). Minimizing $I^{[m]}$ suppresses the system
performance degradation caused by quantizing feedback CSI,
e.g., the throughput loss in the following example.

Example 3. Let $S_t^{[m]}$ and $I_t^{[m]}$ denote the signal and inter-
ference power received at receiver $m$ in slot $t$, respectively.
Assuming Gaussian signaling and high mobility, the through-
put loss $\Delta R^{[m]}$ of the $m$-th data link is bounded as [8]

$$\Delta R^{[m]} \leq \log_2\left(1 + \frac{I^{[m]}}{\sigma^2}\right) \tag{9}$$

where $\sigma^2$ is the variance of a sample of the additive-white-
Gaussian-noise process and (9) uses Jensen's inequality. It can
be observed from (9) that minimizing an upper bound on the
throughput loss is equivalent to minimizing $I^{[m]}$.

III. Problem Formulation 

The design of the feedback controller is formulated as 
a stochastic optimization problem under an average sum- 
feedback constraint. 

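The cost function and state space for feedback control are
defined as follows. To this end, the channel direction $\mathbf{s}_t^{[mn]}$ is
decomposed as

$$\mathbf{s}_t^{[mn]} = \sqrt{1 - \delta_t^{[mn]}}\, \mathbf{u}_t^{[mn]} + \sqrt{\delta_t^{[mn]}}\, \mathbf{q}_t^{[mn]} = \sqrt{1 - \hat{\delta}_t^{[mn]}}\, \hat{\mathbf{u}}_t^{[mn]} + \sqrt{\hat{\delta}_t^{[mn]}}\, \hat{\mathbf{q}}_t^{[mn]} \tag{10}$$

where $\mathbf{q}_t^{[mn]}$ and $\hat{\mathbf{q}}_t^{[mn]}$ are unit-norm vectors orthogonal to $\mathbf{u}_t^{[mn]}$
and $\hat{\mathbf{u}}_t^{[mn]}$, respectively. Based on the above decomposition,
we can define the channel parameters $\beta_t^{[mn]} = |(\mathbf{f}_t^{[n]})^\dagger \mathbf{q}_t^{[mn]}|^2$
and $\hat{\beta}_t^{[mn]} = |(\mathbf{f}_t^{[n]})^\dagger \hat{\mathbf{q}}_t^{[mn]}|^2$. Using these parameters and
substituting (1) allow $I^{[m]}$ in (8) to be written as

$$I^{[m]} = \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} \sum_{n \neq m} E\left[ g_t^{[mn]} \hat{\beta}_t^{[mn]} \hat{\delta}_t^{[mn]} \mid \mathbf{x}_t^{[m]}, \mathbf{b}_t^{[m]} \right] \right] \tag{11}$$

where $\mathbf{x}_t^{[m]}$ and $\mathbf{b}_t^{[m]} = \{B_t^{[mn]} \mid n \neq m\}$ denote the
state and decision of the feedback controller at receiver
$m$, respectively. From (11), the controller's state should
be intuitively chosen to comprise all channel parameters
$\{(g_t^{[mn]}, \delta_t^{[mn]}, \beta_t^{[mn]}) \mid n \neq m\}$. However, this results in the
coupling of feedback control at different receivers. Specifi-
cally, by definition, the state parameter $\beta_t^{[mn]}$ depends on the
beamformer $\mathbf{f}_t^{[n]}$ that in turn is computed based on the feedback
CSI from the receivers $\{m \mid m \neq n\}$, and each of these
receivers also controls other beamformers. Therefore, to enable
distributive feedback control, $\beta_t^{[mn]}$ is excluded from the
controller's state and hence $\mathbf{x}_t^{[m]} = \{(g_t^{[mn]}, \delta_t^{[mn]}) \mid n \neq m\}$,
where each parameter pair $(g_t^{[mn]}, \delta_t^{[mn]})$ depends only on the
single channel $\mathbf{h}_t^{[mn]}$. Since all channel vectors are isotropic
and feedback control is independent of $\{\beta_t^{[mn]}\}$, $\beta_t^{[mn]}$
and $\hat{\beta}_t^{[mn]}$ can be shown to be $\mathrm{beta}(1, L - 2)$ random variables
and independent of $(g_t^{[mn]}, \hat{\delta}_t^{[mn]})$ [9]. This simplifies (11)
as

$$I^{[m]} = \frac{1}{L-1} \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} \sum_{n \neq m} E\left[ g_t^{[mn]} \hat{\delta}_t^{[mn]} \mid \mathbf{x}_t^{[m]}, \mathbf{b}_t^{[m]} \right] \right]. \tag{12}$$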

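Consider a stationary feedback-control policy and an aver-
age sum-feedback constraint where the total average feedback
rate for each receiver is no more than $b > 0$. The optimal
policy $\mathcal{P}_m^*: \mathbf{x}_t^{[m]} \to \mathbf{b}_t^{[m]}$ at receiver $m$ solves the following
infinite-horizon stochastic optimization problem:^2

$$\begin{aligned} \text{minimize:} \quad & I^{[m]}(\mathcal{P}_m) \\ \text{subject to:} \quad & \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} \sum_{n \neq m} B_t^{[mn]} \right] \leq b. \end{aligned} \tag{13}$$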



Due to symmetric channel distributions, it is optimal for
receiver $m$ to equally split $b$ for $(K - 1)$ feedback links.
Consequently, the optimization of $\mathcal{P}_m$ reduces to that of the
policy $\mathcal{P}$ for controlling an arbitrary single feedback link. To
simplify notation, define the random process $(g_t, \delta_t, \hat{\delta}_t, B_t) \sim
(g_t^{[mn]}, \delta_t^{[mn]}, \hat{\delta}_t^{[mn]}, B_t^{[mn]})$, where $\sim$ represents equality in
distribution, and the metric

$$J(\mathcal{P}) = \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} E\left[ g_t \hat{\delta}_t \mid x_t, B_t \right] \right] \tag{14}$$

where $x_t = (g_t, \delta_t)$ is the state of a single-feedback-link
controller. Then $\mathcal{P}: x_t \to B_t$ can be designed by solving
the following optimization problem:

$$\begin{aligned} \text{minimize:} \quad & J(\mathcal{P}) \\ \text{subject to:} \quad & \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} B_t \right] \leq \frac{b}{K-1}. \end{aligned} \tag{15}$$

^2 It is also possible to formulate the optimal feedback control as a finite-
horizon stochastic optimization problem. However, the current infinite-horizon
formulation not only leads to a stationary control policy but also allows
tractable analysis of the policy structure. Furthermore, the infinite-horizon
approximation is justified by the fact that a communication session in a practical
system such as 3GPP LTE usually spans thousands of frames.



IV. The Optimal Feedback-Control Policy 

In this section, we derive the optimal feedback-control
policy for general mobility. Given channel Markovity, the op-
timization problem in (15) can be transformed into a stochastic
optimization problem as follows. By applying Lagrangian-
multiplier theory, there exists a Lagrangian multiplier $\lambda > 0$
such that the optimal policy that solves (15) also minimizes
the following Lagrangian function:

$$\mathcal{L}(\mathcal{P}) = \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} \left( E\left[ g_t \hat{\delta}_t \mid x_t, B_t \right] + \lambda B_t \right) \right]. \tag{16}$$

Minimizing $\mathcal{L}(\mathcal{P})$ is an average-cost stochastic optimization
problem with a continuous state space. Though there exists
no systematic method for solving this problem, it can be
approximated by a discrete-space counterpart whose solution
can be computed efficiently using dynamic programming [32].
The required state-space discretization is discussed and the
resultant optimal feedback-control policy is analyzed in the fol-
lowing subsections.

A. State-Space Discretization 

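The spaces of the feedback-controller's state parameters $g_t$
and $\delta_t$ are discretized separately. The set $\mathcal{G} = \{g_t \geq 0\}$
is partitioned into $M$ line segments $[\bar{g}_1, \bar{g}_2)$, $[\bar{g}_2, \bar{g}_3)$, $\cdots$,
$[\bar{g}_M, \infty)$ with $0 = \bar{g}_1 < \bar{g}_2 < \cdots < \bar{g}_M$. These line
segments are represented by a set of $M$ grid points $\tilde{\mathcal{G}} = \{\tilde{g}_m\}$
with $\tilde{g}_m \in [\bar{g}_m, \bar{g}_{m+1})$. Specifically, $g_t \in \mathcal{G}$ is mapped to $\tilde{g}_m$
if $g_t$ lies in the $m$-th line segment. Similarly, we divide the
set $\mathcal{D} = \{0 \leq \delta_t \leq 1\}$ into $N$ line segments $[\bar{\delta}_1, \bar{\delta}_2)$, $[\bar{\delta}_2, \bar{\delta}_3)$,
$\cdots$, $[\bar{\delta}_N, 1]$ with $0 = \bar{\delta}_1 < \bar{\delta}_2 < \cdots < \bar{\delta}_N < 1$
and represent these segments using a set of grid points
$\tilde{\mathcal{D}} = \{\tilde{\delta}_n\}$ with $\tilde{\delta}_n \in [\bar{\delta}_n, \bar{\delta}_{n+1})$. The optimization of the grid
points $\tilde{\mathcal{G}}$ and $\tilde{\mathcal{D}}$ is outside the scope of this paper. Last, the
discrete state space is represented by $\tilde{\mathcal{X}} = \tilde{\mathcal{G}} \times \tilde{\mathcal{D}}$.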
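
The mapping from a continuous state parameter to its grid point can be sketched as a sorted-boundary lookup. The boundary and grid values below are arbitrary illustrations (the text does not optimize them), and `quantize` is a hypothetical helper name.

```python
from bisect import bisect_right


def quantize(value, boundaries, grid):
    """Return (segment index, grid point) for the segment containing value.

    boundaries: ascending left edges of the segments, starting at 0.
    grid: one representative point per segment, in the same order.
    The last segment is unbounded for the gain and ends at 1 for the error.
    """
    if value < boundaries[0]:
        raise ValueError("value lies below the first segment")
    index = bisect_right(boundaries, value) - 1
    return index, grid[index]


# Illustrative grids with M = 3 gain segments and N = 3 error segments.
gain_edges, gain_grid = [0.0, 1.0, 3.0], [0.5, 2.0, 5.0]
error_edges, error_grid = [0.0, 0.3, 0.7], [0.1, 0.5, 0.9]

state = (quantize(2.7, gain_edges, gain_grid)[1],
         quantize(0.82, error_edges, error_grid)[1])
print(state)
```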

The discretized version of the controller state $x_t$ is denoted
as $\tilde{x}_t = (\tilde{g}_t, \tilde{\delta}_t)$. Given Assumption 2, $\{\tilde{g}_t\}$ and $\{\tilde{\delta}_t\}$ are two
Markov chains whose transition probabilities are obtained as
follows. Let $P_{n,\ell}(B)$ denote the probability for the transition
of $\tilde{\delta}$ from the state $n$ to $\ell$ given the feedback decision $B$. Then
$P_{n,\ell}$ can be written as

$$P_{n,\ell}(B) = \Pr\left(\tilde{\delta}_{t+1} = \tilde{\delta}_\ell \mid \tilde{\delta}_t = \tilde{\delta}_n, B_t = B\right)$$

where $1 \leq n, \ell \leq N$. Similarly, let $P_{m,k}$ denote the transition
probability for $\{\tilde{g}_t\}$, which is given as

$$P_{m,k} = \Pr\left(\tilde{g}_{t+1} = \tilde{g}_k \mid \tilde{g}_t = \tilde{g}_m\right)$$

where $1 \leq m, k \leq M$. Note that given $B$, $P_{n,\ell}(B)$ and $P_{m,k}$
are independent as a result of channel isotropy. Last, the
transition kernel for the controller-state Markov chain $\{\tilde{x}_t\}$ is
$\{P_{m,k}\} \times \{P_{n,\ell}(B)\}$.

B. The Structure of the Optimal Feedback-Control Policy

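The stochastic optimization problems for feedback control
with the discrete state space $\tilde{\mathcal{X}}$ are formulated as follows. De-
fine the corresponding feedback-control policy as $\tilde{\mathcal{P}}: \tilde{\mathcal{X}} \to \mathbb{B}$.
The matching average cost function $\tilde{\mathcal{L}}$ is modified from (16)
as

$$\tilde{\mathcal{L}}(\tilde{\mathcal{P}}) = \lim_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=1}^{T} G(\tilde{x}_t, B_t) \right] \tag{17}$$

where $G(\tilde{x}_t, B_t)$ is the cost-per-stage obtained using (7) as

$$G(\tilde{x}_t, B_t) = \begin{cases} \tilde{g}_t E[\epsilon_t \mid B_t] + \lambda B_t, & B_t > 0 \\ \tilde{g}_t \tilde{\delta}_t, & B_t = 0. \end{cases} \tag{18}$$

Note that the minimum cost $\tilde{\mathcal{L}}^*$ converges to $\min_{\mathcal{P}} \mathcal{L}(\mathcal{P})$
as $N, M \to \infty$ provided that the grid points are suitably
chosen [29], [33]. The optimal policy $\tilde{\mathcal{P}}^*$ can be computed
efficiently using policy iteration [32]. The analysis of $\tilde{\mathcal{P}}^*$
is made tractable by considering a discounted-cost problem.
Specifically, given a discount factor $\rho \in (0, 1)$ and the initial
state $\tilde{x}_0$, a stationary feedback-control policy $\tilde{\mathcal{P}}_\rho: \tilde{\mathcal{X}} \to \mathbb{B}$ is
designed by minimizing the discounted cost function

$$V_\rho(\tilde{x}_0) = E\left[ \sum_{t=0}^{\infty} \rho^t G(\tilde{x}_t, B_t) \mid \tilde{x}_0 \right]. \tag{19}$$

The optimal policy $\tilde{\mathcal{P}}_\rho^*$ and minimum cost $V_\rho^*$ converge to
their average-cost counterparts as $\tilde{\mathcal{P}}^* = \lim_{\rho \to 1} \tilde{\mathcal{P}}_\rho^*$ and $\tilde{\mathcal{L}}^* =
\lim_{\rho \to 1} (1 - \rho) V_\rho^*(\tilde{x}_0)$ for arbitrary $\tilde{x}_0$ [32].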

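The discounted-cost problem allows simpler analysis as $V_\rho^*$
satisfies the following Bellman's equation:

$$V_\rho^*(\tilde{x}_t) = \mathsf{F} V_\rho^*(\tilde{x}_t), \quad \forall \tilde{x}_t \tag{20}$$

where $\mathsf{F}$ is the dynamic-programming operator, defined for
a given function $q: \tilde{\mathcal{X}} \to \mathbb{R}$ as

$$\mathsf{F} q(\tilde{x}_t) = \min_{B \in \mathbb{B}} \left\{ G(\tilde{x}_t, B) + \rho E\left[ q(\tilde{x}_{t+1}) \mid \tilde{x}_t, B \right] \right\}. \tag{21}$$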



(22) 
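
Repeated application of the dynamic-programming operator (value iteration) gives a simple way to approximate the discounted-cost solution on a small grid. The sketch below is a toy instance under stated assumptions: the grids, the transition matrices, the decision set {0, 4, 8}, and the use of the sphere-cap mean for the expected quantization error are all illustrative choices, not values from the paper.

```python
from itertools import product


def quant_error_mean(bits, L):
    """Expected quantization error under the sphere-cap model."""
    return (L - 1) / L * 2.0 ** (-bits / (L - 1))


def stage_cost(g, delta, bits, lam, L):
    """Cost per stage: quantization cost plus bit price, or g*delta if no feedback."""
    if bits > 0:
        return g * quant_error_mean(bits, L) + lam * bits
    return g * delta


def value_iteration(g_grid, d_grid, P_g, P_d, lam, L, rho, tol=1e-9):
    """Iterate the Bellman operator until the value function stops changing.

    P_g[m][k]: gain transition probabilities.
    P_d[B][n][l]: CSIT-error transition probabilities under decision B.
    Returns the value table and the greedy policy (bits per state).
    """
    M, N = len(g_grid), len(d_grid)
    decisions = sorted(P_d)
    V = [[0.0] * N for _ in range(M)]
    while True:
        V_new = [[0.0] * N for _ in range(M)]
        policy = [[0] * N for _ in range(M)]
        for m, n in product(range(M), range(N)):
            best_cost, best_bits = None, None
            for B in decisions:
                future = sum(P_g[m][k] * P_d[B][n][l] * V[k][l]
                             for k in range(M) for l in range(N))
                cost = stage_cost(g_grid[m], d_grid[n], B, lam, L) + rho * future
                if best_cost is None or cost < best_cost:
                    best_cost, best_bits = cost, B
            V_new[m][n], policy[m][n] = best_cost, best_bits
        gap = max(abs(V_new[m][n] - V[m][n]) for m in range(M) for n in range(N))
        V = V_new
        if gap < tol:
            return V, policy


# Toy model (all numbers are assumptions made for illustration).
G_GRID = [0.5, 2.0, 6.0]
D_GRID = [0.1, 0.5, 0.9]
P_G = [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]]
P_D = {
    0: [[0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.0, 0.2, 0.8]],  # error drifts upward
    4: [[0.7, 0.2, 0.1]] * 3,  # after feedback the next error ignores the old one
    8: [[0.9, 0.1, 0.0]] * 3,
}

values, policy = value_iteration(G_GRID, D_GRID, P_G, P_D, lam=0.05, L=4, rho=0.9)
for row in policy:
    print(row)
```

Rows of the printed table correspond to gain grid points and columns to CSIT-error grid points, so the table can be inspected for the feedback and no-feedback regions.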



Though solving Bellman's equation analytically is infeasible,
we can derive from this equation some properties of the opti-
mal policy as follows. Several auxiliary results are obtained as
shown in the following two lemmas. First, the monotonicity
of $V_\rho^*(\tilde{x})$ depends on whether the following function is negative or
nonnegative:

$$f(\tilde{g}_k, \tilde{\delta}_\ell) = V_\rho^*(\tilde{g}_k, \tilde{\delta}_\ell) - V_\rho^*(\tilde{g}_k, \tilde{\delta}_{\ell-1})$$

with $V_\rho^*(\tilde{g}_k, \tilde{\delta}_\ell) = 0$ if either $k = 0$ or $\ell = 0$.

Lemma 1. The function $f(\tilde{x})$ is nonnegative for all $\tilde{x} \in \tilde{\mathcal{X}}$.

Proof: The proof uses the value iteration, namely that
for an arbitrary function $q: \tilde{\mathcal{X}} \to \mathbb{R}$, the minimum discounted
cost is [32]

$$V_\rho^*(\tilde{x}_t) = \lim_{n \to \infty} \mathsf{F}^n q(\tilde{x}_t). \tag{23}$$

We show that if $q$ is chosen to have the property in the lemma
statement, this property also holds for $\mathsf{F} q$ or in other words,
remains unchanged by the dynamic-programming operation.
Combining this fact and the value iteration in (23) proves the
lemma. The details are provided in Appendix A. ■

Lemma 1 shows that $f(\tilde{g}, \tilde{\delta})$ is a monotone increasing
function of $(\tilde{g}, \tilde{\delta}) \in \tilde{\mathcal{X}}$. Next, define the function

$$Z(\tilde{x}_t, B) = G(\tilde{x}_t, B) + \rho E\left[ V_\rho^*(\tilde{x}_{t+1}) \mid \tilde{x}_t, B \right]. \tag{24}$$

Given the relation

$$\tilde{\mathcal{P}}_\rho^*(\tilde{x}_t) = \arg\min_{B \in \mathbb{B}} Z(\tilde{x}_t, B), \tag{25}$$

the structure of $\tilde{\mathcal{P}}_\rho^*$ depends directly on the characteristics of
$Z$, which are specified in the following lemma.

Lemma 2. $Z(\tilde{x}_t, B)$ has the following properties.

1) With $\tilde{x}_t$ fixed and for $B \in \mathbb{B}$ and $B > 0$, $Z(\tilde{x}_t, B)$ is a
monotone decreasing and convex function of $B$;

2) With $B$ fixed, $Z(\tilde{g}_t, \tilde{\delta}_t, B)$ is a monotone increasing
function of $\tilde{g}_t$ and also of $\tilde{\delta}_t$ if $B = 0$, and of $\tilde{g}_t$ and
independent of $\tilde{\delta}_t$ if $B > 0$.

The proof is provided in Appendix B. Using Lemmas 1 and
2, the key result of this section is obtained as follows.

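Theorem 1. The optimal feedback-control policy $\tilde{\mathcal{P}}^*$ has the
following properties.

1) If there exists $(a, b) \in \tilde{\mathcal{X}}$ such that $\tilde{\mathcal{P}}^*(a, b) = 0$,
$\tilde{\mathcal{P}}^*(a, \delta) = 0$ for all $\delta \in \tilde{\mathcal{D}}$ and $\delta \leq b$.

2) If there exists $(a, b) \in \tilde{\mathcal{X}}$ such that $\tilde{\mathcal{P}}^*(a, b) > 0$,
$\tilde{\mathcal{P}}^*(a, \delta) = \tilde{\mathcal{P}}^*(a, b)$ for all $\delta \in \tilde{\mathcal{D}}$ and $\delta \geq b$.

3) If there exist $(a, b), (c, b) \in \tilde{\mathcal{X}}$ such that $\tilde{\mathcal{P}}^*(a, b) > 0$
and $\tilde{\mathcal{P}}^*(c, b) > 0$, $\tilde{\mathcal{P}}^*(c, b) \geq \tilde{\mathcal{P}}^*(a, b)$ if $c \geq a$ and vice
versa.
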
The proof is presented in Appendix C. The structure of $\tilde{\mathcal{P}}^*$
as specified in Theorem 1 is illustrated in Fig. 3, from which
$\tilde{\mathcal{P}}^*$ is observed to be opportunistic in nature. CSI feedback
over a particular feedback link is performed only when the
corresponding CSIT error and/or interference-channel gain
are large. As a result, the optimal policy partitions the state
space into the feedback and no-feedback regions similar to the
on/off-feedback policy in [29]. The current policy that supports
variable-rate feedback further partitions the feedback region
into smaller regions and assigns them different numbers of
feedback bits. Upon feedback, the number of feedback bits
increases with the interference-channel gain. The CSIT error
observed prior to feedback affects the decision on whether feedback
should be performed but has no influence on the number of
feedback bits upon feedback. The reason is that the CSIT
error after feedback is equal to the quantization error, which is
independent of the CSIT error prior to feedback.
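
The three properties in Theorem 1 translate into a simple shape test on a policy table. The sketch below assumes the policy is stored as a list of rows indexed by gain grid point (ascending), with columns indexed by CSIT-error grid point (ascending); the function name and the example tables are hypothetical.

```python
def has_theorem1_structure(policy):
    """Check the threshold structure of a table policy[m][n] of feedback bits.

    Properties 1 and 2: along the CSIT-error axis each row is zero up to a
    threshold and then constant at one positive value.
    Property 3: for a fixed CSIT error, the positive entries do not decrease
    as the gain grows.
    """
    for row in policy:
        first_on = next((n for n, bits in enumerate(row) if bits > 0), None)
        if first_on is not None and any(bits != row[first_on] for bits in row[first_on:]):
            return False
    for n in range(len(policy[0])):
        positive = [row[n] for row in policy if row[n] > 0]
        if any(low > high for low, high in zip(positive, positive[1:])):
            return False
    return True


opportunistic = [
    [0, 0, 0],  # small gain: never feed back
    [0, 4, 4],  # medium gain: feed back once the error is large enough
    [8, 8, 8],  # large gain: always feed back, with more bits
]
print(has_theorem1_structure(opportunistic))
print(has_theorem1_structure([[0, 4, 0]]))
```

A table produced by a numerical solver can be passed through such a check as a sanity test, keeping in mind that the theorem holds only under the stated assumptions.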

Intuitively, the feedback-link should be turned off less 
frequently when the interference-channel gain is large. In 
other words, the feedback-threshold function separating the 
feedback and no-feedback regions should map larger values 
of g to smaller ones of S. However, proving this property 
requires more restrictive assumptions on the channel temporal 
correlation than the current ones. 

The feedback control can be treated as the dual of bit
loading (or adaptive modulation) over forward data links [34],
[35]. Both feedback control and bit loading opportunistically
allocate (CSI or data) bits over (feedback or forward) channels
based on instantaneous (interference or data) CSI. Furthermore,
both functions share the same objective of enhancing
the system throughput.

Fig. 3. The structure of the optimal feedback-control policy
$\mathcal{P}^\star$, where $0 < B_1 < \cdots < B_{A-1} < B_A$ and
$\{B_a\} \in \mathcal{B}$.

In wireless communication networks such as 3GPP-LTE,
users are assigned dedicated (orthogonalized) feedback links.
This approach incurs fast growing network overhead with the
increasing popularity of cooperative transmission techniques
such as multi-cell joint transmission [36] or interference
alignment [18], for which the number of feedback links may
increase quadratically with the number of users. Perhaps a
more efficient approach is to allow multiple users to share a
single feedback channel using e.g., a random access protocol.
For this case, intelligent feedback control by receivers will
alleviate feedback-traffic congestion and reduce the feedback
delay for CSI that is time sensitive.

C. The Computation of the Optimal Feedback-Control Policy

The optimal feedback-control policy can be efficiently
computed by policy iteration (see e.g., [32]). Each iteration
involves policy evaluation and policy improvement. The step
of policy evaluation in the $i$-th iteration is to compute the
corresponding average reward $J^{(i)}$ conditioned on a given
policy $\mathcal{P}^{(i-1)}$:

$$ J^{(i)} + U_{m,n} = G\big(g_m, \delta_n, \mathcal{P}^{(i-1)}(g_m, \delta_n)\big) + \sum_{k=1}^{M} \sum_{\ell=1}^{N} U_{k,\ell}\, P_{m,k}\, P_{n,\ell}\big(\mathcal{P}^{(i-1)}(g_m, \delta_n)\big), \quad \forall\, m, n \tag{26} $$

$$ U_{1,1} = 0 \tag{27} $$

where $\{U_{m,n}\}$ represents a set of scalars called differential
rewards and the constraint in (27) ensures that the solution of
(26) is unique. In the ensuing step of policy improvement, a
new policy $\mathcal{P}^{(i)}$ is computed using $J^{(i)}$ and
$\{U_{m,n}\}$ obtained by solving (26) and (27):

$$ \mathcal{P}^{(i)}(g_m, \delta_n) = \arg\min_{B \in \mathcal{B}} \Big[ G(g_m, \delta_n, B) + \sum_{k=1}^{M} \sum_{\ell=1}^{N} U_{k,\ell}\, P_{m,k}\, P_{n,\ell}(B) \Big] \tag{28} $$

for all $m$ and $n$. The above two steps are repeated till the
policy converges, namely $\mathcal{P}^{(i)} = \mathcal{P}^{(i-1)}$,
yielding the optimal feedback-control policy. As observed from
simulation, the policy iteration converges typically within
several iterations.

V. The Optimal Feedback-Control Policy: High
Mobility

In this section, we focus on the regime of high mobility
and derive more elaborate structural results for the optimal
feedback-control policy by directly solving the optimization
problem (15) rather than relying on dynamic programming.
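As a rough illustration of the policy-iteration procedure used to compute the optimal policy (policy evaluation followed by policy improvement), the following Python sketch runs average-cost policy iteration on a deliberately tiny model. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two states ("fresh" and "stale" transmit CSI), the two actions (no feedback, feedback), the per-slot costs with a feedback price of 0.3, and the transition probabilities. The first differential reward is pinned to zero, which plays the role of the uniqueness constraint in (27).

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def policy_iteration(cost, trans, max_iter=100):
    """cost[s][a]: one-step cost; trans[a][s][s2]: transition probability.
    Returns (policy, average_cost); assumes every policy visited is unichain."""
    S, A = len(cost), len(cost[0])
    policy = [0] * S
    J = None
    for _ in range(max_iter):
        # Policy evaluation: J + U[s] = cost + sum_s2 P[s][s2] * U[s2], U[0] = 0.
        # Unknown vector is (J, U[1], ..., U[S-1]).
        rows, rhs = [], []
        for s in range(S):
            a = policy[s]
            rows.append([1.0] + [(1.0 if s2 == s else 0.0) - trans[a][s][s2]
                                 for s2 in range(1, S)])
            rhs.append(cost[s][a])
        x = solve(rows, rhs)
        J, U = x[0], [0.0] + x[1:]
        # Policy improvement: greedy action with respect to the differential rewards.
        new_policy = [
            min(range(A), key=lambda a: cost[s][a]
                + sum(trans[a][s][s2] * U[s2] for s2 in range(S)))
            for s in range(S)
        ]
        if new_policy == policy:
            break
        policy = new_policy
    return policy, J


# Hypothetical toy model: state 0 = fresh CSI, state 1 = stale CSI;
# action 0 = no feedback, action 1 = feedback (error cost 0.1 plus price 0.3).
cost = [[0.1, 0.4], [1.0, 0.4]]
trans = [
    [[0.5, 0.5], [0.0, 1.0]],  # no feedback: CSI may become (and then stays) stale
    [[1.0, 0.0], [1.0, 0.0]],  # feedback: CSI is fresh in the next slot
]
policy, avg_cost = policy_iteration(cost, trans)
```

In this toy model the resulting policy feeds back only in the stale state, which mirrors the opportunistic structure discussed in the text; the numbers themselves carry no meaning beyond the illustration.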



A. The Structure of the Optimal Feedback-Control Policy 

To simplify the solution of (15), we consider the sphere- 
cap-quantized-CSI model in Example 1, resulting in the 
optimal feedback-control policy of the water-filling type as 
shown in the sequel. This property is expected to also hold 
for the random-vector quantization in Example 2 since the 
quantization-error expectations for both models have similar 
exponential forms (compare (4) and (6)). 

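Given independent block fading and a stationary feedback-control
policy, the optimal feedback decisions in different slots
are made independently. Consequently, $(g_t, \delta_t, B_t)$ have
stationary distributions and are i.i.d. in different slots. To simplify
notation, let $(g, \delta, B)$ represent a sample of
$\{(g_t, \delta_t, B_t)\}$ in an arbitrary slot. Using this notation
and (4), (15) can be rewritten as follows:

$$ \begin{aligned} \underset{B}{\text{minimize:}} \quad & \mathsf{E}\left[ g \min\left( \frac{L-1}{L}\, 2^{-\frac{B}{L-1}},\ \delta \right) \right] \\ \text{subject to:} \quad & \mathsf{E}[B] \le \frac{b}{K-1} \\ & B \in \mathcal{B} \end{aligned} \tag{29} $$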



where the min operator in the objective function accounts for
the fact that feedback from a receiver to a particular interferer
should be performed only if it reduces the expected CSIT
error. Solving the problem in (29) analytically is difficult due
to the constraint $B \in \mathcal{B}$. To overcome this difficulty, the
constraint $B \in \mathcal{B}$ is relaxed as $B \ge 0$, which approximates
the case where many quantization resolutions are supportable.
The above optimization problem is modified accordingly as:

$$ \begin{aligned} \underset{B}{\text{minimize:}} \quad & \mathsf{E}\left[ g \min\left( \frac{L-1}{L}\, 2^{-\frac{B}{L-1}},\ \delta \right) \right] \\ \text{subject to:} \quad & \mathsf{E}[B] \le \frac{b}{K-1} \\ & B \ge 0. \end{aligned} \tag{30} $$



Solving the above problem yields the structure of the optimal 
feedback-control policy as described in the following propo- 
sition. 

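Proposition 1. For high mobility, the optimal feedback-control
policy $\mathcal{P}^\star : \mathcal{X} \to \mathbb{R}_+$ resulting from
solving (30) is of the water-filling type:

$$ \mathcal{P}^\star(g, \delta) = \begin{cases} T - (L-1)\log_2 \dfrac{1}{g}, & \delta \ge \Psi(g) \\ 0, & \text{otherwise} \end{cases} \tag{31} $$

where $T$ is the water level given as

$$ T = \frac{b}{(K-1)\Pr(\delta \ge \Psi(g))} + (L-1)\, \mathsf{E}\left[ \log_2 \frac{1}{g} \,\Big|\, \delta \ge \Psi(g) \right]. $$

The feedback-threshold function $\Psi : \mathcal{G} \to \mathcal{D}$ solves
the following optimization problem:

$$ \begin{aligned} \text{minimize:} \quad & \bar{P}(\Psi) \\ \text{subject to:} \quad & T(\Psi) - (L-1)\log_2 \frac{1}{\Psi^{-1}(1)} \ge 0 \\ & \frac{L-1}{L\, g}\, 2^{-\frac{T(\Psi)}{L-1}} \le \Psi(g) \end{aligned} \tag{32} $$

where $\bar{P}(\Psi)$ given below is the sum-interference power at any
receiver achieved by given $\Psi$:

$$ \bar{P}(\Psi) = \frac{L-1}{L}\, 2^{-\frac{T}{L-1}} \Pr(\delta \ge \Psi(g)) + \Pr(\delta < \Psi(g))\, \mathsf{E}\left[ g\delta \mid \delta < \Psi(g) \right]. \tag{33} $$

In addition, $\Psi(g)$ is a monotone decreasing function of $g$.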

The proof is provided in Appendix D. The above policy 
structure is consistent with that of the general solution as 
described by Theorem 1 and its remarks. Moreover, the 
optimal feedback-control for high mobility is similar to the 
classic adaptive modulation algorithm that allocates data bits 
in time also based on water-filling [34]. 
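A minimal numerical sketch of the water-filling rule in (31) follows, under simplifying assumptions that are not from the paper: the CSIT-error threshold is ignored (as if feedback were always worthwhile) and negative allocations are simply clipped to zero; the gain samples are drawn from a unit-mean exponential distribution purely for illustration; and the helper names `water_level` and `feedback_bits` are hypothetical. The water level is found by bisection so that the empirical mean number of bits meets a target average rate.

```python
import math
import random


def water_level(gains, avg_bits, L, tol=1e-9):
    """Bisection for T such that mean((T - (L-1)*log2(1/g))^+) equals avg_bits."""
    costs = [(L - 1) * math.log2(1.0 / g) for g in gains]
    # Mean bits is 0 at T = min(costs) and at least avg_bits at T = max(costs) + avg_bits.
    lo, hi = min(costs), max(costs) + avg_bits
    while hi - lo > tol:
        T = 0.5 * (lo + hi)
        mean_bits = sum(max(0.0, T - c) for c in costs) / len(costs)
        if mean_bits > avg_bits:
            hi = T
        else:
            lo = T
    return 0.5 * (lo + hi)


def feedback_bits(g, T, L):
    """Clipped water-filling allocation for one slot with interference gain g."""
    return max(0.0, T - (L - 1) * math.log2(1.0 / g))


random.seed(0)
# Assumed gain distribution (illustrative only); the floor avoids log2(1/0).
gains = [max(random.expovariate(1.0), 1e-12) for _ in range(2000)]
T = water_level(gains, avg_bits=4.0, L=4)
bits = [feedback_bits(g, T, 4) for g in gains]
```

Doubling the interference-channel gain adds `L - 1` bits whenever the allocation is positive, which is the logarithmic growth stated in the proposition.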

For a large average feedback rate $b \gg 1$, $\Pr(\delta \ge \Psi(g)) \approx 1$
and thus the minimum average sum-interference power at an
arbitrary receiver follows from (33) as

$$ \bar{P} \approx \frac{L-1}{L}\, 2^{-\frac{T}{L-1}} = c\, 2^{-\frac{b}{(K-1)(L-1)}} \tag{34} $$

where $c$ is a constant. It can be observed from (34) that $\bar{P}$
decreases exponentially with increasing $b$, where the slope is
smaller for a larger number of links or transmit antennas per
transmitter. In addition, the optimal number of feedback bits
given in (31) needs to be rounded down to the nearest
integer for implementation and this operation increases $\bar{P}$ by
a multiplicative factor no larger than $2^{\frac{1}{L-1}}$.
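The rounding penalty can be checked numerically. The sketch below assumes the sphere-cap expected error $\frac{L-1}{L} 2^{-B/(L-1)}$ referred to in the text; the function names are hypothetical. Rounding down removes less than one bit, so the error ratio is $2^{(B - \lfloor B \rfloor)/(L-1)}$, which stays below $2^{1/(L-1)}$.

```python
import math


def expected_error(bits, L):
    """Sphere-cap model: expected quantization error for a given number of bits."""
    return (L - 1) / L * 2.0 ** (-bits / (L - 1))


def rounding_penalty(bits, L):
    """Factor by which the expected error grows when bits is rounded down."""
    return expected_error(math.floor(bits), L) / expected_error(bits, L)


# Worst case over a fine grid of fractional bit allocations, for L = 4 antennas.
worst = max(rounding_penalty(0.01 * k, 4) for k in range(0, 1000))
```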

It is infeasible to obtain the feedback-threshold function $\Psi$
analytically by solving the optimization problem in Proposition 1.
Thus computing $\Psi$ requires a numerical search, which
is used to obtain relevant simulation results in Section VI.

Finally, we obtain some insight into the effect of quantizing
direct-link feedback (feedback from a receiver to the intended
transmitter) and justify its omission in the performance metric.
Consider an arbitrary data link in the current MISO interference
channel, where the transmit beamformer, channel-direction
vector, and received interference power are denoted
as $\mathbf{f}_0$, $\mathbf{s}_0$ and $I_0$, respectively. The direct-link
feedback of the quantized version $\hat{\mathbf{s}}_0$ of $\mathbf{s}_0$ allows
the transmitter to perform the maximum-ratio transmission under
the constraint of zero-forcing beamforming [37]. As a result, the
corresponding effective channel gain after beamforming can be
shown to be $\psi(1 - \zeta)$, where $\psi$ follows the chi-square
distribution with $2(L - K + 1)$ degrees of freedom and $\zeta$ is no
larger than the quantization error $\epsilon_0$ of $\hat{\mathbf{s}}_0$, where
$\epsilon_0 = 1 - |\hat{\mathbf{s}}_0^{\dagger} \mathbf{s}_0|^2$ [37]. For a
high SINR and small $\zeta$, the throughput $R$ of the considered data
link can be approximated as follows:

$$ \begin{aligned} R &\approx \mathsf{E}\left[ \log_2 \frac{\psi(1-\zeta)}{I_0} \right] \\ &\approx -\mathsf{E}[\zeta] - \mathsf{E}[\log_2 I_0] + \mathsf{E}[\log_2 \psi] \\ &\ge -\mathsf{E}[\epsilon_0] - \log_2 \mathsf{E}[I_0] + \mathsf{E}[\log_2 \psi]. \end{aligned} \tag{35} $$

Assuming that $\hat{\mathbf{s}}_0$ is generated by a random vector quantizer,
it follows from (6) that $\mathsf{E}[\epsilon_0]$ scales as
$2^{-\frac{B_0}{L-1}}$, where $B_0$ denotes the number of
direct-link-feedback bits. Moreover, for high mobility and a large
cooperative feedback rate, $\mathsf{E}[I_0]$ can be approximated by
$\bar{P}$ given in (34). Then it follows from (34) and (35) that

$$ R \ge -\mathsf{E}[\epsilon_0] + \frac{b}{(K-1)(L-1)} + \text{constant}. \tag{36} $$

The first term at the right-hand side of (36) represents the
throughput loss due to the direct-link-feedback error and
the second the throughput gain obtained by increasing the
cooperative feedback rate. It can be observed that the effect
of the direct-link-feedback error diminishes exponentially with
$B_0$ and is hence omitted in the current analysis.

¹The operator $(a)^+$ for $a \in \mathbb{R}$ gives $a$ if $a \ge 0$ or otherwise 0.

B. Extension to Asymmetric Channel Distributions 

In the preceding sections, all interference channels are
assumed to follow identical distributions. In this section, we
discuss feedback control for asymmetric interference channel
distributions in terms of heterogeneous path losses and assuming
high mobility for mathematical tractability. Let $d^{[mn]}$
denote the distance between receiver $m$ and transmitter $n$. The
average interference power at receiver $m$ can be written as



a] ^ —0(.j[mn\ 



(37) 



where a is the path-loss exponent and 

j[mn\ 



1 



mn, 

gi ' mm 



in 



B 



(38) 

Given heterogeneous path losses, the uniform allocation of 
average feedback rates by each receiver to different feedback 
channels is no longer optimal. Consequently, we should op- 
timize the average feedback-rate allocation besides feedback- 
control over time. Specifically, the feedback-control optimiza- 
tion problem can be decomposed as: 

- Master problem (average feedback-rate allocation) 



minimize: 

{bm,n} 



subject to: 



K 

E 

n=l 
K 

n=l 



F[mn] 



,n<b 

> V m ^ n 



(39)

where $I^{[mn]}(b_{m,n})$ solves the following sub-problem.



- Sub-problem (stochastic feedback control) 

f[mn] /T> ^ 



mmimize: 



subject to: E 



(40) 



where $I^{[mn]}$ is given in (38) and $\mathcal{P}_{mn}$ denotes the
stationary policy for controlling the feedback link from
receiver $m$ to transmitter $n$.
Note that the sub-problem is identical to (15) except for the
difference in the maximum average feedback rates. The above
decomposed optimization problems have a unique solution
as shown below.

Lemma 3. $I^{[mn]}(b_{m,n})$ is a convex and monotone decreasing
function over $b_{m,n} \ge 0$.

The proof is presented in Appendix E. The following result
holds given the convexity of the master problem as a result
of Lemma 3 and that of the sub-problem, which follows from the
discussion in Section V.

Proposition 2. Solving the master problem and sub-problem
gives a unique optimal stationary feedback-control policy.

Next, we characterize the optimal feedback-control policy
based on the quantizer model in Example 1 and for a large
average sum-feedback rate per user. For this case, using
(34) and given $b_{m,n}$, the average interference power from
transmitter $n$ to receiver $m$ can be approximated as

JM ~ — J— -2"^[^°S2 i]2-fef (41) 



L-l 

where g follows the chi-square distribution. This approxima- 
tion reduces the master problem as: 

K 



minimize: 



subject to: 



=1 

K 

E 

n=l 

n^ra 



bm,n > V m 7^ n. 

Solving the above constrained optimization problem using the
Lagrangian method yields that the optimal allocation of average
feedback rates is of the water-filling type:

$$ b^\star_{m,n} = \eta - \alpha (L-1) \log_2 d^{[mn]}, \quad n \ne m \tag{42} $$

where $\eta$ is the water level given as

$$ \eta = \frac{b}{K-1} + \frac{\alpha (L-1)}{K-1} \sum_{n \ne m} \log_2 d^{[mn]}. \tag{43} $$

For a sanity check, the substitution of equal distances
$d^{[m1]} = d^{[m2]} = \cdots$ into (42) gives equal-rate splitting:
$b^\star_{m,n} = \frac{b}{K-1}$, $n \ne m$. It can be observed from (42)
that the optimal average feedback rate allocated by a receiver
for suppressing the interference power of a particular interferer
decreases logarithmically with the increasing distance between
the interferer and the receiver. Relaxing the integer constraint
on the numbers of feedback bits and combining (31) and (42),
we can approximate the optimal number of feedback bits
$B^\star_{mn}$ sent from receiver $m$ to transmitter $n$ with $n \ne m$ as

$$ B^\star_{mn} \approx \eta' - \alpha (L-1) \log_2 d^{[mn]} - (L-1) \log_2 \frac{1}{g^{[mn]}} \tag{44} $$

where $\eta'$ is a constant. The above expression shows two-tier
water-filling for allocating average feedback rates over
multiple feedback links and for each link distributing feedback
bits over different slots.
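The rate split across feedback links can be sketched numerically. The first function below implements the closed-form water-filling over links (a common level minus a term logarithmic in the distance) with the level fixed by the sum-rate constraint; the second adds clipping at zero with the level found by bisection, which is an assumption of this sketch for the case where the closed form would assign a negative rate to a distant interferer. Function names and the example numbers (distances 1 and 3, path-loss exponent 3, four antennas, as in the simulation setting) are used only for illustration.

```python
import math


def allocate_rates(distances, total_rate, L, alpha):
    """Closed-form water-filling over links, ignoring the positivity constraint."""
    k = len(distances)
    eta = total_rate / k + alpha * (L - 1) / k * sum(math.log2(d) for d in distances)
    return [eta - alpha * (L - 1) * math.log2(d) for d in distances]


def allocate_rates_clipped(distances, total_rate, L, alpha, tol=1e-9):
    """Same rule with allocations clipped at zero; water level by bisection."""
    costs = [alpha * (L - 1) * math.log2(d) for d in distances]
    # The clipped sum is 0 at min(costs) and at least total_rate at max(costs) + total_rate.
    lo, hi = min(costs), max(costs) + total_rate
    while hi - lo > tol:
        eta = 0.5 * (lo + hi)
        if sum(max(0.0, eta - c) for c in costs) > total_rate:
            hi = eta
        else:
            lo = eta
    eta = 0.5 * (lo + hi)
    return [max(0.0, eta - c) for c in costs]


equal = allocate_rates([1.0, 1.0], 12.0, L=4, alpha=3.0)       # equal distances
unequal = allocate_rates([1.0, 3.0], 36.0, L=4, alpha=3.0)     # nearer link gets more
clipped = allocate_rates_clipped([1.0, 3.0], 12.0, L=4, alpha=3.0)
```

With equal distances the split is uniform, matching the sanity check in the text. With a small total rate the clipped variant gives the whole budget to the nearer interferer.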

The feedback scheme in (44) is similar to those in [26], [27] 
in that the optimal number of feedback bits for a particular 
feedback link increases logarithmically with the channel gain 
of the corresponding forward link, despite the differences 
in settings (interference networks, cooperative multi-cell net- 
works [26], or multiuser downlink systems [27]) and metrics 
(sum interference power, throughput loss [26] or total trans- 
mission power [27]). The fundamental reason for the above 
similarity is that different performance optimization problems 
can be reduced to or approximated by one that minimizes a 
weighted sum of exponential functions of numbers of feedback 
bits under a constraint on the sum-feedback rate. 

VI. Simulation Results 

The simulation has the following settings unless specified
otherwise. The number of antennas is $L = 4$, the number of
users is $K = 3$, and the set of available numbers of feedback
bits is $\mathcal{B} = \{2n \mid 0 \le n \le 15\}$. All channel fading
coefficients are modeled as i.i.d. $\mathcal{CN}(0, 1)$ Gaussian
processes. For low-to-moderate mobility, the temporal correlation
of each process is specified by Clarke's function [38]. The values
of Doppler frequency are normalized by the symbol rate. The state
space for feedback control at low-to-moderate mobility is discretized
to have $M = 16$ grid points for the interference-channel gain
and $N = 16$ points for the CSIT error. The set $\mathcal{G}$ is chosen
based on the equal-probability criterion such that
$\Pr(\tilde{g}_k \le g < \tilde{g}_{k+1}) = \frac{1}{M}$ for
$1 \le k \le M$, $\tilde{g}_1 = 0$ and $\tilde{g}_{M+1} = \infty$.
The CSI quantization error is generated based on the
sphere-cap-quantized-CSI model in Example 1. Correspondingly, the
grid points for the CSIT error are chosen to be the expected
quantization errors for different numbers of feedback bits in
$\mathcal{B}$, namely
$\mathcal{D} = \left\{ \frac{L-1}{L}\, 2^{-\frac{B}{L-1}} \mid B \in \mathcal{B} \right\}$.
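For concreteness, the CSIT-error grid described above can be tabulated as follows. This sketch assumes the sphere-cap expected error $\frac{L-1}{L} 2^{-B/(L-1)}$ and reads the set of feedback-bit numbers as the even integers from 0 to 30 (sixteen values, matching the sixteen error grid points); the function name is hypothetical.

```python
def csit_error_grid(L=4, max_n=15):
    """Pairs (B, expected error) for B in {2n : 0 <= n <= max_n}."""
    bit_set = [2 * n for n in range(max_n + 1)]
    return [(b, (L - 1) / L * 2.0 ** (-b / (L - 1))) for b in bit_set]


grid = csit_error_grid()
```

The entry for zero bits equals $(L-1)/L$, and every additional $L - 1$ bits halves the expected error.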

Figs. 4 to 6 concern stochastic feedback control for low-to-moderate
mobility. Fig. 4 shows the optimal feedback-control
policies computed using policy iteration for different combinations
of (normalized) Doppler frequency $f_d$ and average
sum-feedback rates $b$. Both Fig. 4(a) and 4(b) are consistent with
Theorem 1. Specifically, it can be observed from the figures
that given the optimal policy, the state space is partitioned
into the feedback and no-feedback regions. Moreover, in the
feedback region, $B^\star$ is independent of the CSIT error $\delta$; given
$\delta$, $B^\star$ is a monotone non-decreasing function of $g$. Comparing
Fig. 4(a) and 4(b), increasing Doppler frequency and the
average sum-feedback rate enlarge the feedback region as well
as the numbers of feedback bits in the feedback region.
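The structural properties listed above can be checked mechanically on a computed policy table. The sketch below assumes a table `policy[m][n]` holding the number of feedback bits, with gain index `m` and CSIT-error index `n` both sorted in ascending order; this indexing convention, the function name, and the example tables are assumptions for illustration only.

```python
def has_opportunistic_structure(policy):
    """Check the three structural properties on a table policy[m][n] of bits."""
    for row in policy:
        # 1) For a fixed gain, once feedback is on it stays on for larger errors.
        seen_feedback = False
        for bits in row:
            if bits > 0:
                seen_feedback = True
            elif seen_feedback:
                return False
        # 2) Inside the feedback region, bits do not depend on the error.
        if len({bits for bits in row if bits > 0}) > 1:
            return False
    # 3) For a fixed error, positive bit counts are non-decreasing in the gain.
    for n in range(len(policy[0])):
        column = [row[n] for row in policy if row[n] > 0]
        if any(a > b for a, b in zip(column, column[1:])):
            return False
    return True


good = [[0, 0, 0], [0, 2, 2], [4, 4, 4]]
ok = has_opportunistic_structure(good)
```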

Fig. 4. Optimal feedback-control policies given average sum-feedback
constraints and a discrete state space, plotted over the
interference-channel gain (index) and the transmit CSI error (index):
(a) $f_d = 10^{-2}$ and $b = 12$ bit/slot; (b) $f_d = 6 \times 10^{-3}$
and $b = 36$ bit/slot.

Fig. 5 shows the throughput-per-user versus transmit SNR
for optimally controlled feedback given $b = 12$ bit/slot and
for conventional feedback algorithms with a sum feedback
constraint of 16 bit/slot, where the additional 4 bit/slot
accounts for the extra feedback-control overhead for specifying a
varying number of feedback bits. For comparison, two existing
feedback methods are considered, namely simple feedback for
which CSI in each slot is quantized with a fixed resolution (8
bits) and feedback is performed in each slot (see e.g., [2]) and
differential feedback that exploits channel temporal correlation
for feedback reduction (see e.g., [5]). The differential-feedback
algorithm considered here is from [5] and allows transmitter 
n to construct the channel direction s[^"^^ using the past CSI 
-[^n] ^ quantized L x L unitary matrix a[^^^ sent by 
receiver m as follows: 



1 



mn]\ ^[mn] 



(45) 



where < < 1 is adapted to Doppler frequency by a 
numerical search using the criterion of maximum throughput
and the quantized unitary matrix is chosen from an 8-bit random codebook of i.i.d.

Fig. 5. Throughput-per-user versus transmit SNR for low-to-moderate
mobility and different limited-feedback techniques under an average
sum-feedback constraint per user of 16 bit/slot.

Fig. 6. Throughput-per-user versus average sum-feedback rate for different
limited-feedback techniques with low-to-moderate mobility and the transmit
SNR equal to 13 dB.



entries such that the CSI error is minimized. From Fig. 5, the 
throughput-per-user for both simple and differential feedback 
is observed to saturate as the transmit SNR increases and 
residual interference becomes dominant over noise. The use 
of feedback control alleviates this performance degradation 
and increases the throughput-per-user significantly, especially at
high SNRs. Moreover, the throughput-per-user given feedback 
control increases rapidly as the Doppler frequency decreases, 
corresponding to growing redundancy in CSI. Specifically, re- 
ducing from Ixl0~^to2xl0~^ increases the throughput- 
per-user by up to about 2 bit/s/Hz. 

Fig. 7. Throughput-per-user versus average sum-feedback rate for the optimal
feedback control with high mobility. All interference channels have unit
propagation distances for the case of symmetric channel distributions. For the
asymmetric case, the $(K - 1) = 2$ interferers for each receiver are located
at distances of 1 and 3 units away. The path-loss exponent is $\alpha = 3$ and
the transmit SNR is 13 dB.

Fig. 6 shows the throughput-per-user versus average
sum-feedback rate per user for both controlled feedback and
conventional feedback methods, where the additional
controlled-feedback overhead mentioned earlier has been accounted for.
It can be observed that as the average sum-feedback rate 
increases, the throughput-per-user for the optimally controlled 
feedback converges to the upper bound corresponding to 
perfect CSIT faster than that for differential feedback and 
much more rapidly than that for simple feedback. Conse- 
quently, given the same average sum-feedback constraint, the 
optimal feedback control yields higher throughput than the 
two conventional methods. It can be observed that exploiting 
the channel temporal correlation by either feedback control or 
differential feedback can provide significant throughput gains. 
For example, feedback control increases the throughput-per- 
user of simple feedback by about 3 times given the average 
sum-feedback rate of 7 bit/slot and fd = 2xl0~^. Last, note 
that the humps on the curves for controlled feedback are due 
to discretization of the state space. 

Finally, we consider the optimal feedback control for high 
mobility. Fig. 7 displays the curves of throughput-per-user 
versus average sum-feedback rate per user for controlled 
feedback as well as no feedback control, namely that the rates 
for different feedback links are equal and simple feedback is 
applied. These results are based on $B \in \mathbb{N}_+$, aligned with
the analysis in Section V. It is observed that the throughput 
gain of the optimal feedback control with respect to the case 
of no feedback control is marginal given symmetric channel 
distributions and high mobility, namely no redundancy in 
CSI. However, this gain is significant in the presence of 
asymmetric channel distributions and unequal distribution of 
average feedback rates over different feedback links. 

VII. Conclusion 

This work has proposed the new approach of distributive 
and stochastic control of event-driven CSI feedback in multi- 
antenna interference networks. For symmetric channel distri- 



butions, the optimal feedback-control policy for each feedback 
link has been proved to be opportunistic. Specifically, feedback 
is performed only if the corresponding interference-channel 
gain is large or the CSI at the transmitter is significantly 
outdated; the number of feedback bits increases with the 
interference-channel gain. For high-mobility and symmetric 
channel distributions, by considering a specific CSI quantiza- 
tion model, the optimal feedback policy has been shown to be 
of the water-filling type that also has the above opportunistic 
properties. For high-mobility and heterogeneous path-losses 
for the interference channels, the optimization of the feedback 
controller has been decomposed into a master problem and 
a sub-problem. We have proved the existence of a unique
solution for the decomposed optimization problems.

To the best of our knowledge, this is the first work on 
applying stochastic-optimization theory to design feedback 
controllers in multi-antenna interference networks. This work 
opens several issues for future investigation. First, in the case 
of bursty traffic, the queues and feedback-links can be jointly 
controlled to achieve the optimal tradeoff between transmis- 
sion delay and feedback overhead. Second, the event-driven 
feedback targets shared feedback channels where feedback 
collisions are inevitable. Collisions and the resultant feedback
delay are omitted in the current work but are important issues
to consider in designing practical feedback controllers and
protocols. Last, it is challenging to generalize the current
feedback-controller designs to more complex settings such 
as MIMO channels and spatial multiplexing, and alternative 
beamforming algorithms such as one using the
minimum-mean-square-error criterion.

Appendix

A. Proof for Lemma 1 

For ease of notation, define a function ^{gk \ ^5, 5*) as 

^{gk\h,B^)=E [v;(^t+i=^fe,^t+i)|^t = 4,5"l . (46) 
We can write that 

N 

^{gu I h^B^) = Y,V;{gkX)PhAB^) 

n=l 

N 

N 



with Vp{gk,So) =0. Similarly, from (46), we can obtain that 

E[V;(^t+l,^t+l) I =^a,^t = 4,^1 
= E[^(^t+i I Sb.B'') I gt =ga] 



M 



M 



J2 [HSk I 4,5") - Hgk-i I 4,5")] ^^Pa,^ 

k=l i=k 
N M N M 

EE/(^-'^^) E A,m(5*)EPa,£ (47) 



n=l k^l 



£=k 



with ^(^0 I Sb.B'') = 0. Using (18), (20), and (21), it can be 
obtained for B"" = that 

fVpiSaJb) = SaSb + E [v;(^t+i,^t+i) \gt = gaJt = Sb\ 

N M N M 



n=l k=l 



(48) 



where (48) uses (47). It follows from (22) and (48) that 

F/(^a,4) = ^v;{gkM - Fv;(^,,4-i)- 

fV;{gk-i^Si) + FV;{gk-iJi-i) (49) 

N M 

= iSa -ga-l){h -4-l) + X^X^/(4,^/c)x 

N N 



n=l k=l 



n,m(0) - Y Pb-l,m{0) 

m=n 
- M M 



i=k 



i=k 



(50)

with $g_0 = \delta_0 = 0$. Given Assumption 2, (50) yields that
$\mathsf{F}f(g, \delta) \ge 0$ for all $(g, \delta)$ if $f(g, \delta) \ge 0$
for all $(g, \delta)$ and $B^\star = 0$. Next, from (18), (20), and (21),
it can be obtained for $B^\star > 0$ that


FV;(^a,4)=^aE[e|5T+5*- 



E[v;{gt+i,St+i)\gt = ga.B-]. 



(51) 



It can be observed from (51) that FVp(^a:^6) is independent 
with ^5. Thus, using this fact and (49) gives that ff{g^S) =0 
for > 0. By combining above results, we conclude that 
the policy iteration retains the property f{g,S) > if its 
initialization has such a property (e.g., f{g,S) = 1 for all 
{g^S)). This completes the proof. 

B. Proof for Lemma 2 

Using (18), (24) and (47), it can be obtained that 

Z{a,b,B) = 

N M 



n=l k=l 

N M 



B>0 



m=n 

N M 



N 



M 



^6 + ^^/(4,^^) ^ n,m(0)^Pa/, B = 0. 



n=l k=l 



£=k 



(52) 



Given Assumptions 1 and 3 and using Lemma 1, Property 1) in
the lemma statement holds since $Z(a, b, B)$ is a nonnegative
combination of monotone decreasing and convex functions of
$B$ as can be observed from (52). Property 2) follows from
Assumption 2, Lemma 1 and (52).

C. Proof for Theorem 1 

Since $\mathcal{P}_\rho \to \mathcal{P}^\star$ as $\rho \to 1$, it is sufficient
to prove that given an arbitrary $\rho \in (0, 1)$, $\mathcal{P}_\rho$ has the
properties of $\mathcal{P}^\star$ as described in the theorem statement.
Assume that there exists $(a, b) \in \mathcal{X}$ such that
$\mathcal{P}_\rho(a, b) = 0$. Using (20) and (24), $(a, b)$
satisfies the following condition

$$ Z(a, b, 0) \le \min_{B > 0} Z(a, b, B). \tag{53} $$

It follows from Assumption 2 and (52) that with $a$ fixed,
$Z(a, b, 0)$ is a monotone increasing function of $b$. As a result,
we obtain from (53) that

$$ \begin{aligned} Z(a, \delta, 0) &\le \min_{B > 0} Z(a, b, B), \quad \forall\, \delta \le b \\ &= \min_{B > 0} Z(a, \delta, B) \end{aligned} \tag{54} $$

where (54) holds since $Z(a, \delta, B)$ can be observed from (52)
to be independent of $\delta$ if $B \ne 0$. Property 1) in the theorem
statement is proved by combining (25) and (54). Next, assume
that there exists $(a, b) \in \mathcal{X}$ such that
$\mathcal{P}_\rho(a, b) > 0$. This implies that
$\mathcal{P}_\rho(a, \delta) > 0$ for all $\delta \ge b$ since otherwise
$\mathcal{P}_\rho(a, b) = 0$ based on Property 1, which violates the earlier
assumption. Therefore,

$$ \begin{aligned} \mathcal{P}_\rho(a, \delta) &= \arg\min_{B > 0} Z(a, \delta, B), \quad \forall\, \delta \ge b \\ &= \arg\min_{B > 0} Z(a, b, B) \end{aligned} \tag{55} $$

where (55) results from the equality in (54). Property 2) in
the theorem statement follows from (25) and (55).

Last, assume that there exist (^o, ^5), (^c, 4) ^ ^ such that 
Qa < Sc, Vp{ga, 4) > 0, and Vp{gc, 4) > 0. To facilitate the 
proof, we arrange the elements of B in the ascending order: 
B = ^2, • • • , Ba} with Bi < B2'" < Ba Sind A = 
|B|. Moreover, given x e define the differences 

A+Z(f , Bu) = Z{x, Bu) - Z{x, Bu+i) (56) 
with 1 <u < A and 

A-Z(f , Bu) = Z{x, Bu) - Z{x, Bu-i) (57) 
with 1 <u < A. The substitution of (52) into (56) gives 
A-^Z{ga,Sb,Bu) = Sa {E[e | B^] - E[e | + 



N M 



M 



n=l k=l 

N 

E 



i=k 



N 



(58) 



Pb,m{Bu-\-l) 



It follows that 

A+Z(^„ 4, Bu) - A+^(^c, 4, Bu) 

= {ga - Sc) {E[e I Bu] - E[e | + 



N M 



n=l k=l 



M 



M 



N 



N 



Pb,m{Bu) — Pb,m{Bu-\-l) 



< 



(59) 



where (59) is obtained using Assumptions 1 and 2, and
Lemma 1. Similarly, it can be shown that

A-Z{ga, 4, B^) - A-Z{g,, 4, B^) > 0. (60) 

By replacing B^ in (59) and (60) with V^p{ga,h), 

A+Z(^e,4,P;(^a,4)) > A+Z(^,,4,P;(^a,4)) (61) 
A-Z(^e,4,P;(^a,4)) < A-Z(^„4,P;(^a,4)). (62) 

For B G B and 5 > 0, Z{x^B) with x fixed is a convex 
function of B according to Lemma 2 and from (25) Vp{x) 
minimizes Z(x^B) over B. Consequently, 

for any B < Vp{x), 

A+Z{x,V;{x))>0, 
and for any B > Vp{x), 

A+Z{x,V;{x))<0, 



A-Z{x,V;{x))<0, (63) 
A-Z{x,V;{x))<0, (64) 
A-Z{x,P;{x))>0. (65)

It can be concluded that
$\mathcal{P}_\rho(g_c, \delta_b) \ge \mathcal{P}_\rho(g_a, \delta_b)$ by
combining (61), (62) and (63) and comparing the result with (64)
and (65). This proves Property 3) in the theorem statement,
completing the proof.

D. Proof for Proposition 1 

We claim that there exists a threshold ^ \ g ^ 5 such that 
(30) is equivalent to the following optimization problem: 



mimnuze: 



B 
' L-1 



S > ^{g) 



subject to : E[5] < 
B>0 



(66) 



K 



and prove the claim as follows. Let denote the solution 
of (30). Given g and from (30), if there exists Sa ^ V such 



that ^2" 



L-1 Q- 
L ^ 



^-1 < S for all S > Sa', if there 
ySb^B"" satisfies 5* = 



exists Sb eV such that ^2 
for all S < Si). This proves the claim. 

The above optimization problem can be solved as follows. 
First, by neglecting the positivity constraint on B, the convex 
optimization problem in (66) can be solved using Lagrangian 
method [39]. The resultant policy is specified in (31). Next, 
^ is chosen to suppress the expected interference power as 
well as enforce two constraints in (30): i) feedback reduces 
the expected CSI error, namely ^^2"^^ < S if B > and 
ii) 5 > V (^, ^). It follows that the problem of optimizing ^ 
is as given in (32) with ^~^(1) replaced with min^^ ^~^(5). 
Next, it can be observed from (33) that given Pr(^ > "^{g)), 
minimizing P requires ^(^) to be a monotone decreasing 
function of g, proving the stated monotonicity of ^(^). As 
a result, min^^-^(J) = ^"^(1) and (32) follows. This 
completes the proof. 



E. Proof for Lemma 3 

Let P denote the space of the optimal feedback-control 
policies. Consider Ba^B^ G [j^V^{g\^^\s\^^^) for given 
^^[mn]^^[mn]^ G A'. Notc that ^ [e^ \ B] < 4"^"] if 
B e {Ba, Bb} and B > 0. Using this fact and for /i G [0, 1], 
the term q{B) = min (^E [et^^] | B] , S^!^""^^ in the objective 
function of the sub-problem is proved to be convex as follows 



^q{Ba) + (1 - ^)q{B,) 



liE 



\Ba 



-{l-fi)E 



> < 



I^E 
I (5^ 

E 
E 
E 



mn] 



\Ba 

Bb 
B„ 



+(i-/x)<5r 

+ (1 - ^)Bb 



Ba = 0,Bfc>0 

Ba>Q,Bb = 

Ba = 0,Bb = 
Ba>0,Bb>0 

Ba = 0,Bb>0 
Ba>0,Bb=0 
Ba = 0,Bb=0 



^[mn] 



B 



over 



where the inequality uses the convexity of E 
B as assumed in Assumption 1. It follows that 

fiq{Ba) + (1 - ^)q{B,) = q{^Ba + (1 - ^)B,) 

and hence q{B) is a convex function. 

Next, we prove that I^^^^ (x) is a convex function for x > 
using the sample-path method. Consider two average sum- 
feedback rates bx^by > 0. Let and Vy denote the optimal 
feedback-control policies that yield ij^n^ {bx) and (^y), 
respectively. Consider the sample paths j^'j^^^j and 
|^[mn]|^ {5f and {B^}^^ denote the sequences 
of numbers of feedback bits generated by Vx and Vy, re- 
spectively. Moreover, given /i G [0,1], define the sequence 
{Bf }~ 1 = }£i + (1 - M){5r}£i- Using the function 
q{B) defined earlier, we can write 



mn\ 
min 



{K) + (1 - 

T 

E 



mn\ 
min 



lim — E 

T^oo T 



1 



> lim — 

- T^oo T 



^=1 
■ T 

El 



L-1 
1 



g^r^ [M(5f) + (l-/i)g(5f)] 



1 



9t q{Bt, 



{libx + (1 - lj)by) 



(67) 



(68) 




where (67) uses the convexity of $q(B)$ as proved earlier. The
desired result follows from (68).